Longitudinal analysis of epigenome-wide DNA methylation reveals novel loci associated with BMI change in East Asians

Background: Obesity is a global public health concern linked to chronic diseases such as cardiovascular disease and type 2 diabetes (T2D). Emerging evidence suggests that epigenetic modifications, particularly DNA methylation, may contribute to obesity. However, the molecular mechanism underlying longitudinal change in BMI has not been well explored, especially in East Asian populations.
Methods: This study performed a longitudinal epigenome-wide association analysis of DNA methylation to uncover novel loci associated with BMI change in 533 individuals across two Chinese cohorts with repeated DNA methylation and BMI measurements over four years.
Results: We identified three novel CpG sites (cg14671384, cg25540824, and cg10848724) significantly associated with BMI change. Two of the identified CpG sites were located in regions previously associated with body shape and basal metabolic rate. Annotation of the top 20 BMI change-associated CpGs revealed strong connections to obesity and T2D. Notably, these CpGs exhibited active regulatory roles and were located in genes with high expression in the liver and digestive tract, suggesting a potential regulatory pathway from genome to phenotypes of energy metabolism and absorption via DNA methylation. Comparison of cross-sectional and longitudinal EWAS indicated different mechanisms for CpGs related to BMI and CpGs related to BMI change.
Conclusion: This study enhances our understanding of the epigenetic dynamics underlying BMI change and emphasizes the value of longitudinal analyses in deciphering the complex interplay between epigenetics and obesity.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13148-024-01679-x.


Supplementary Texts
The linear mixed model
The linear mixed model (LMM) is a statistical tool for analyzing data collected over time from the same individuals or subjects. It accounts for within-subject correlations and heterogeneity in the data while allowing fixed and random effects to be estimated. In cohort 1, we conducted the LMM analysis using the "lme4" package in R [1], specifying the relationship between the outcome variable and the predictor variables through fixed effects and random effects. The LMM can be written as

$$M_i = \beta_0 + \beta_1\,\mathrm{BMI}_i + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{sex}_i + \boldsymbol{\gamma}^{\top}(\mathrm{CT})_i + b_0 + b_1\,\mathrm{BMI}_i + \varepsilon_i,$$

where $M_i$ is the methylation for the $i$-th subject, $\mathrm{BMI}_i$ the BMI for the $i$-th subject at baseline, $\mathrm{age}_i$ and $\mathrm{sex}_i$ the age and sex of the $i$-th subject, and $(\mathrm{CT})_i$ includes the predicted percentages of B cells, CD4+ and CD8+ T cells, NK cells, monocytes and neutrophils. $b_0$ is the random intercept modelling baseline individual heterogeneity and $b_1$ the random slope modelling individual heterogeneity in the relationship; both $b_0$ and $b_1$ are assumed to follow Gaussian distributions, and $\varepsilon_i$ is the residual error.
We compared the EWAS results of cohort 1 obtained using the LMM with the cross-sectional results obtained using traditional regression models, and observed similar EWAS results (Supplementary Figure 11). The Pearson's correlation coefficient (PCC) between the EWAS results calculated using the LMM and the EWAS results of baseline BMI was 0.84 (two-sided t-test P = 3.37 × 10⁻²⁸⁴), while the PCC between the EWAS results using the LMM and the EWAS results of follow-up BMI was 0.86 (two-sided t-test P = 1.11 × 10⁻²¹²).
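The comparison above rests on a Pearson correlation between per-CpG results from two analyses, tested with a t statistic. The following is a minimal sketch of that calculation, not the code used in the study: the function name `pearson_with_t` and the effect-size values are hypothetical, and the P value (which would come from a t distribution with n − 2 degrees of freedom) is left out to keep the example within the standard library.

```python
import math

def pearson_with_t(x, y):
    """Return (r, t): Pearson correlation and its t statistic (n - 2 df)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # t statistic for H0: rho = 0; infinite when the correlation is perfect.
    if abs(r) < 1:
        t = r * math.sqrt((n - 2) / (1.0 - r * r))
    else:
        t = math.copysign(math.inf, r)
    return r, t

# Hypothetical per-CpG effect sizes from two EWAS runs (illustrative values only).
lmm_effects = [0.10, -0.20, 0.05, 0.30, -0.15]
cross_sectional_effects = [0.12, -0.18, 0.02, 0.28, -0.20]

r, t = pearson_with_t(lmm_effects, cross_sectional_effects)
print(f"PCC = {r:.3f}, t = {t:.2f}")
```

In a real comparison the two lists would hold the results for the same CpGs in the same order, one entry per CpG.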

A secondary EWAS model adjusting for smoking and drinking
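We built a secondary model to include smoking and drinking as confounders. The formula of the secondary model is as follows:

$$\Delta M_i = \beta_0 + \beta_1\,\Delta\mathrm{BMI}_i + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{sex}_i + \beta_4\,\mathrm{smoking}_i + \beta_5\,\mathrm{drinking}_i + \boldsymbol{\gamma}^{\top}(\mathrm{CT})_i + \varepsilon_i,$$

where $\Delta M_i$ is the change of methylation for the $i$-th subject, $\Delta\mathrm{BMI}_i$ the continuous value of BMI change from baseline for the $i$-th subject, $\mathrm{age}_i$ and $\mathrm{sex}_i$ the age and sex of the $i$-th subject, $\mathrm{smoking}_i$ and $\mathrm{drinking}_i$ the smoking and drinking status of the $i$-th subject, and $(\mathrm{CT})_i$ includes the predicted percentages of B cells, CD4+ and CD8+ T cells, NK cells, monocytes and neutrophils.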
Building on the baseline model, the secondary model additionally adjusted for smoking and drinking status. The secondary model identified the same CpGs (cg14671384, cg25540824, and cg10848724) as the baseline model at the threshold of P < 1 × 10⁻⁶. We further compared the effect sizes in the EWAS results of these two models.
The effect sizes of the secondary model adjusting for drinking and smoking were highly consistent with the effect sizes of our baseline model (Pearson's correlation coefficient = 0.997; Supplementary Figure 1), indicating that the EWAS of BMI change was not sensitive to smoking and drinking.
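To illustrate how a baseline and a secondary model can be fitted for a single CpG and their BMI-change effect sizes compared, here is a minimal sketch using ordinary least squares in pure Python. It rests on several assumptions not stated in the source: methylation change is treated as the outcome, cell-type proportions are omitted for brevity, and the subject data, coefficient values, and the helper `ols` are all hypothetical.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]

# Hypothetical subjects: (BMI change, age, sex, smoking, drinking).
subjects = [
    (0.5, 45, 0, 0, 0), (-1.2, 52, 1, 1, 0), (2.0, 38, 0, 0, 1), (0.0, 60, 1, 1, 1),
    (1.1, 47, 1, 0, 0), (-0.4, 55, 0, 1, 1), (0.8, 41, 1, 0, 1), (-2.1, 63, 0, 1, 0),
]
# Noise-free synthetic methylation change with a known BMI-change effect of 0.02.
d_meth = [0.01 + 0.02 * d + 0.0005 * a - 0.003 * s + 0.004 * sm + 0.001 * dr
          for d, a, s, sm, dr in subjects]

# Baseline model: intercept, BMI change, age, sex.
X_base = [[1.0, d, a, s] for d, a, s, _, _ in subjects]
# Secondary model: baseline covariates plus smoking and drinking status.
X_secondary = [[1.0, d, a, s, sm, dr] for d, a, s, sm, dr in subjects]

beta_base = ols(X_base, d_meth)
beta_secondary = ols(X_secondary, d_meth)
print("BMI-change effect, baseline model: ", beta_base[1])
print("BMI-change effect, secondary model:", beta_secondary[1])
```

Repeating the two fits for every CpG and correlating the resulting pairs of BMI-change coefficients gives the kind of consistency check reported above.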

DMR analysis
We conducted the differentially methylated region (DMR) analysis using the R package DMRcate [2] to detect genomic regions with differential DNA methylation patterns correlated with BMI change. First, we identified differentially methylated positions (DMPs) using the "cpg.annotate" function of DMRcate. Then, DMRcate used a spatial kernel smoothing approach to model DNA methylation levels across the genome and conducted hypothesis tests to determine DMRs. The identified DMRs were contiguous genomic regions with consistent methylation differences and were assigned significance scores based on the statistical testing results. We required each DMR to contain at least three CpGs to ensure confidence in the identified regions, as in [3,4].
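The idea of assembling nearby significant CpGs into regions and keeping only regions with at least three CpGs can be sketched as follows. This is a deliberately simplified illustration and not DMRcate's method: it replaces kernel smoothing and hypothesis testing with a plain distance rule, and the function name `group_into_regions`, the `max_gap` value of 1000 bp, and the coordinates are all assumptions made for the example.

```python
def group_into_regions(cpgs, max_gap=1000, min_cpgs=3):
    """Group (chrom, position) pairs into regions of nearby CpGs.

    Consecutive CpGs on the same chromosome join one region when they are
    at most `max_gap` bp apart; regions with fewer than `min_cpgs` CpGs
    are discarded. Returns (chrom, start, end, n_cpgs) tuples.
    """
    regions = []
    current = []

    def flush():
        # Keep the current run only if it contains enough CpGs.
        if len(current) >= min_cpgs:
            regions.append((current[0][0], current[0][1], current[-1][1], len(current)))

    for chrom, pos in sorted(cpgs):
        if current and (chrom != current[-1][0] or pos - current[-1][1] > max_gap):
            flush()
            current = []
        current.append((chrom, pos))
    flush()
    return regions

# Hypothetical significant CpG positions (not from the study).
significant = [
    ("chr1", 1000), ("chr1", 1200), ("chr1", 1500),  # three CpGs within 500 bp
    ("chr1", 9000),                                   # isolated CpG
    ("chr2", 500), ("chr2", 700),                     # only two CpGs
]
regions = group_into_regions(significant)
print(regions)
```

Only the first cluster survives the three-CpG filter; the isolated CpG and the two-CpG cluster are dropped.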
Using the methods described above, we identified a DMR at "chr20:57427472-57427713", located near GNAS. GNAS contains a differentially methylated region at its 5' exons, a feature commonly found in imprinted genes that correlates with transcript expression. Because of imprinting, mutations on the maternal allele of GNAS can cause obesity and hormone resistance (pseudohypoparathyroidism) [5].
However, the CpGs identified by the longitudinal EWAS analysis did not fall into this DMR. The DMR analysis can therefore be considered a supplement to the DMP analysis.

Comparison of EWAS of BMI change in different populations
Demerath et al. conducted an EWAS analysis of BMI change with 2097 African American adults in the Atherosclerosis Risk in Communities (ARIC) study as the discovery cohort and 2377 White adults in the Framingham Heart Study as the replication cohort [6]. In that study, 8 CpGs (cg15871086, cg09554443, cg26403843, cg07136133, cg13123009, cg00574958, cg03546163, and cg16672562) were identified as significantly associated with BMI change. According to their analysis, the CpGs identified in American adults were mainly near genes involved in lipid metabolism, immune response/cytokine signaling and other diverse pathways. We compared the CpGs identified in the American population with those identified in the Asian population and found no overlap between the two studies. In addition, the top CpGs identified in the Asian population were involved in negative regulation of protein phosphorylation and cell migration, both of which can be induced by growth factors and play important roles in the development of body height and obesity [7,8]. The difference between the CpGs identified in different populations may reflect underlying genetic variation that is specific to each population.
Differences in diet, physical activity, socioeconomic status, and other environmental factors may also lead to population-specific associations between CpG sites and BMI change. Overall, the differences in CpGs identified across populations highlight the complex interplay between genetic, environmental, and epigenetic factors in shaping BMI-related phenotypes and underscore the importance of considering population-specific factors in epigenome-wide association studies.
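The overlap check between the two studies amounts to a set intersection on CpG identifiers. A minimal sketch follows, using the eight CpGs reported by Demerath et al. [6] and, as an assumption about which CpGs were compared, the three novel CpGs from this study; the variable names are hypothetical.

```python
# CpGs reported by Demerath et al. for BMI change (listed in the text above).
demerath_cpgs = {
    "cg15871086", "cg09554443", "cg26403843", "cg07136133",
    "cg13123009", "cg00574958", "cg03546163", "cg16672562",
}
# The three novel CpGs from this study (assumed comparison set for this sketch).
this_study_cpgs = {"cg14671384", "cg25540824", "cg10848724"}

# CpGs shared by both studies.
overlap = demerath_cpgs & this_study_cpgs
print(f"{len(overlap)} shared CpGs: {sorted(overlap)}")
```

The intersection is empty for these two lists, matching the absence of overlap described above.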